Automatic Speech Recognition in Adverse Acoustic Conditions

Authors

Febe de Wet

Johan de Veth

Bert Cranen

Abstract

Automatic Speech Recognition (ASR) technology has matured to the extent that it can be used successfully in various applications. However, it is by no means the “solved problem” that some marketing campaigns promote it to be. One of the biggest challenges facing operational ASR systems is maintaining recognition performance in adverse acoustic conditions. The training procedures of most ASR systems yield recognisers with a relatively rigid image of the world: only those acoustic variations that actually occurred in the training data are accounted for. Since training data is usually clean (in the sense that care is taken to avoid noisy recording environments, channel noise, etc.), noise sources that are present when the system is operational result in a mismatch between the training and the test conditions. Such a mismatch may reduce recognition performance quite significantly. The aim of this research is to determine the extent to which the robustness of ASR systems against mismatched training and test conditions may be increased using acoustic backing-off as an implementation of Missing Feature Theory.
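The abstract names acoustic backing-off without describing it. As a rough illustration only, the sketch below assumes one common formulation: the per-feature cost is the negative log of a mixture of the trained Gaussian and a flat "unknown conditions" density, so that a feature value far outside the training range contributes a bounded penalty instead of an unbounded one. The function names, the mixing weight `eps`, and the flat density `p0` are hypothetical choices made for this example, not values taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def plain_cost(x, mean, var):
    """Conventional local cost: negative log-likelihood under the trained Gaussian.

    Computed analytically so that it does not underflow for outliers.
    """
    return 0.5 * math.log(2.0 * math.pi * var) + 0.5 * (x - mean) ** 2 / var


def backed_off_cost(x, mean, var, eps=0.01, p0=0.01):
    """Assumed backing-off cost: -log((1 - eps) * N(x) + eps * p0).

    eps (mixing weight) and p0 (flat density) are illustrative assumptions.
    The cost can never exceed -log(eps * p0), however far x lies from mean.
    """
    return -math.log((1.0 - eps) * gaussian_pdf(x, mean, var) + eps * p0)


if __name__ == "__main__":
    # A value matching the training conditions and a heavily distorted one.
    for x in (0.0, 50.0):
        print(x, plain_cost(x, 0.0, 1.0), backed_off_cost(x, 0.0, 1.0))
```

Under these assumptions, a feature value close to the trained mean costs almost the same with or without backing-off, while a heavily distorted value is capped at `-log(eps * p0)`, which limits how much a single corrupted feature can dominate the total score of a frame.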

Related articles

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in the emotional speech recognition area in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

A comparison of LPC and FFT-based acoustic features for noise robust ASR

Within the context of robust acoustic features for automatic speech recognition (ASR), we evaluated mel-frequency cepstral coefficients (MFCCs) derived from two spectral representation techniques, i.e. the fast Fourier transform (FFT) and linear predictive coding (LPC). ASR systems based on the two feature types were tested on a digit recognition task using continuous density hidden Markov ph...

A Hybrid Method for Automatic Speech Recognition Performance Improvement in Real World Noisy Environment

It is a well-known fact that speech recognition systems perform well when they are used in conditions similar to those used to train the acoustic models. However, mismatches degrade the performance. In adverse environments, it is very difficult to predict the category of real-world environmental noise in advance, and difficult to achieve environmental robustness. After do...

Publication date: 1999
